Skip to content

Conversation

@jamwaffles
Copy link
Collaborator

Apologies to @techmccat in #27, it seems you've nerd-sniped me into making my own fast draw implementation.

Would you be able to test it on your device and report back any performance changes you see? I don't have a good way of benchmarking on my end and you seem to be able to make comparisons as in #27.

@techmccat
Copy link

It's showing artifacts on my end, I think it has to do with how the mask is computed.

@jamwaffles
Copy link
Collaborator Author

Thanks for testing. Would you be able to post your test code here so I can reproduce? A photo of what you're seeing on the display would be great as well if you can be bothered. I thought I had this impl pretty dialed in but nope!

@techmccat
Copy link

The code is going to be a little annoying to upload since it's scattered across a bunch of folders, but I might be able to reproduce it with a smaller test case.
Here's a picture of the artifacts
artifacts

@jamwaffles
Copy link
Collaborator Author

Thanks for the photo nonetheless. I can probably work with just the drawing code and put it into my own harness if that makes it any easier to post?

@techmccat
Copy link

techmccat commented Mar 22, 2022

The code draws quadtrees and is like this

#[derive(Debug, PartialEq, Clone)]
pub struct Leaf {
    pub feature: bool,
    pos: Vec<u8, 7>,
}

impl Dimensions for Leaf {
    fn bounding_box(&self) -> Rectangle {
        let mut s = 128;
        let mut x = 0;
        let mut y = 0;

        for p in &self.pos {
            s /= 2;
            match p {
                0 => (),
                1 => x += s,
                2 => y += s,
                3 => {
                    x += s;
                    y += s
                }
                _ => (),
            }
        }

        let point = Point::new(x as i32, y as i32);
        let size = Size::new_equal(s);

        Rectangle::new(point, size)
    }
}

impl Drawable for Leaf {
    type Color = BinaryColor;
    type Output = ();

    fn draw<DT>(&self, target: &mut DT) -> Result<Self::Output, DT::Error>
    where
        DT: DrawTarget<Color = Self::Color>,
    {
        let Rectangle {
            top_left: point,
            size,
        } = self.bounding_box();
        let rect = if self.pos.len() == 0 {
            Rectangle::new(point, size).intersection(&target.bounding_box())
        } else {
            Rectangle::new(point, size)
        };
        target.fill_solid(&rect, self.feature.into())
    }
}

Drawing a bunch of squares with 1 to 4 px long sides should be a good enough test case, here's the data from the lib test harness

[
        Leaf {
            pos: Vec::from_slice(&[0, 3]).unwrap(),
            feature: true,
        },
        Leaf {
            pos: Vec::from_slice(&[0, 2, 3]).unwrap(),
            feature: true,
        },
        Leaf {
            pos: Vec::from_slice(&[0, 2, 2, 3]).unwrap(),
            feature: true,
        },
        Leaf {
            pos: Vec::from_slice(&[0, 2, 2, 2, 3]).unwrap(),
            feature: true,
        },
        Leaf {
            pos: Vec::from_slice(&[0, 2, 2, 2, 2, 3]).unwrap(),
            feature: true,
        },
        Leaf {
            pos: Vec::from_slice(&[0, 2, 2, 2, 2, 2, 3]).unwrap(),
            feature: true,
        },
    ]

@techmccat
Copy link

Update: I can't reproduce the bug with the test rectangles I've provided, nor can I figure out why it's happening.

@techmccat
Copy link

techmccat commented Mar 22, 2022

Managed to isolate the bug to the if block at line 250 (falling back to draw_contiguous doesn't show any artifacts), I'll look into it more tomorrow.

@jamwaffles
Copy link
Collaborator Author

Hey @techmccat, I came up with an interesting solution using bit shifts in a u128 while I was drifting off to sleep last night. I've done some testing on my display and behaviour appears correct, but I'd very much appreciate it if you could run your benchmark with this new algorithm, as well as test for any artifacts in your application.

u128s don't optimise too well for thumbv6m, at least according to godbolt.org, but do sort of ok with thumbv7m. I'm hoping that even suboptimal assembly using u128s are still faster than previous algorithms - but your benchmark will show either way I hope. I'd like to see some hard numbers before I optimise further.

@techmccat
Copy link

Surprisingly enough the artifacts seem to be gone, but it's a tad slower than the previous version (was 4-5ms and now it's 6-7ms)

@jamwaffles
Copy link
Collaborator Author

Thanks for testing. There's definitely some more optimisations to be made. So I don't have to ask you to benchmark every change, do you have a link to a full repository I can use to test improvements against?

I'm not surprised it's a bit slower tbh, although I think you mentioned it used to take ~40ms to draw your test code prior to any of these changes, so that's a good improvement from baseline.

Can I ask what target you're compiling for? Is it thumbv6m, thumbv7m, etc?

@techmccat
Copy link

techmccat commented Mar 30, 2022

my target is thumbv7em (STM32F411 clocked at 50MHz).
Benchmarking code is a bit of a mess but i'll try to sort it out and upload it later today.
Edit: i uploaded the code at https://github.com/techmccat/blackpill-sh1106

@jamwaffles jamwaffles force-pushed the fast-draw-improved branch from d75d9b8 to f6df7b8 Compare April 1, 2022 20:21
@jamwaffles jamwaffles force-pushed the fast-draw-improved branch from f6df7b8 to 1b1d8cd Compare April 1, 2022 20:24
@jamwaffles
Copy link
Collaborator Author

There is still some performance left on the table with this PR, but I think it's in a good enough place to merge and iterate on in the future. If there are no objections, I'll merge this and put a crates.io release out soon after.

@jamwaffles
Copy link
Collaborator Author

Benchmark results running bench-fill on my STM32F103 Blue Pill:

branch commit time
master 2736564 156ms
fast-draw-improved 9f8c1e3 10ms

Not very scientific but it's a huge improvement when just drawing rectangles.

@jamwaffles jamwaffles force-pushed the fast-draw-improved branch from c725190 to e640e1b Compare June 25, 2022 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants